An Empirical Exploration of Skip Connections for Sequential Tagging
Authors
Abstract
In this paper, we empirically explore the effects of various kinds of skip connections in stacked bidirectional LSTMs for sequential tagging. We investigate three kinds of skip connections connecting to LSTM cells: (a) skip connections to the gates, (b) skip connections to the internal states and (c) skip connections to the cell outputs. We present comprehensive experiments showing that skip connections to cell outputs outperform the remaining two. Furthermore, we observe that using gated identity functions as skip mappings works well. Based on these novel skip connections, we successfully train deep stacked bidirectional LSTM models and obtain state-of-the-art results on CCG supertagging and comparable results on POS tagging.
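To make variant (c) concrete, the following is a minimal numpy sketch of a single LSTM step in which the output of the layer below (`skip_in`) is added to the cell output through a learned gate (a gated identity mapping). The parameter names (`Wg`, `Ug`, `bg`) and the exact form of the skip gate are illustrative assumptions, not the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 4, 3  # input and hidden sizes (illustrative)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(D, H):
    return {
        # standard LSTM parameters; the four gates (i, f, o, g) are stacked
        "W": rng.standard_normal((4 * H, D)) * 0.1,
        "U": rng.standard_normal((4 * H, H)) * 0.1,
        "b": np.zeros(4 * H),
        # parameters of the skip gate (hypothetical names)
        "Wg": rng.standard_normal((H, D)) * 0.1,
        "Ug": rng.standard_normal((H, H)) * 0.1,
        "bg": np.zeros(H),
    }

def lstm_step_with_output_skip(x, h_prev, c_prev, skip_in, p):
    """One LSTM step; `skip_in` (the lower layer's output) is added to
    the cell output through a learned gate -- a sketch of variant (c)."""
    z = p["W"] @ x + p["U"] @ h_prev + p["b"]
    n = h_prev.shape[0]
    i, f = sigmoid(z[:n]), sigmoid(z[n:2 * n])
    o, g = sigmoid(z[2 * n:3 * n]), np.tanh(z[3 * n:])
    c = f * c_prev + i * g
    # gated identity skip: the gate decides how much of the lower
    # layer's output flows directly into this layer's output
    s = sigmoid(p["Wg"] @ x + p["Ug"] @ h_prev + p["bg"])
    h = o * np.tanh(c) + s * skip_in
    return h, c

p = init_params(D, H)
x = rng.standard_normal(D)
h, c = lstm_step_with_output_skip(x, np.zeros(H), np.zeros(H),
                                  rng.standard_normal(H), p)
print(h.shape)  # (3,)
```

In a stacked model, `skip_in` would be the hidden state produced by the layer below at the same time step, so the gate can interpolate between the current layer's transformation and a near-identity path.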
Similar Resources
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing in natural language processing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
Discriminating copper geochemical anomalies and assessment of their reliability using a combination of sequential Gaussian simulation and gap statistic methods in Hararan area, Kerman, Iran
In geochemical exploration, there are various techniques such as univariate and multivariate statistical methods available for recognition of anomalous areas. Univariate techniques are usually utilized to estimate the threshold value, which is the smallest quantity among the values representing the anomalous areas. In this work, a combination of the Sequential Gaussian Simulation (SGS) and Gap ...
Variable Activation Networks: a Simple Method to Train Deep Feed-forward Networks without Skip-connections
Novel architectures such as ResNets have enabled the training of very deep feedforward networks via the introduction of skip-connections, leading to state-of-the-art results in many applications. Part of the success of ResNets has been attributed to improvements in the conditioning of the optimization problem (e.g., avoiding vanishing and shattered gradients). In this work we propose a simple me...
Deep-FSMN for Large Vocabulary Continuous Speech Recognition
In this paper, we present an improved feedforward sequential memory network (FSMN) architecture, namely Deep-FSMN (DFSMN), by introducing skip connections between memory blocks in adjacent layers. These skip connections enable information flow across different layers and thus alleviate the vanishing-gradient problem when building very deep structures. As a result, DFSMN significantly benefi...
The Shattered Gradients Problem: If resnets are the answer, then what is the question?
A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients. Although the problem has largely been overcome via carefully constructed initializations and batch normalization, architectures incorporating skip-connections such as highway networks and resnets perform much better than standard feedforward architectures despite well-chosen initialization and batc...